Map/Reduce Affinity Propagation Clustering Algorithm

Authors

  • Wei-Chih Hung
  • Chun-Yen Chu
  • Yi-Leh Wu
  • Cheng-Yuan Tang
Abstract

Affinity Propagation (AP) is a clustering algorithm that does not require the number of clusters K to be set in advance. We extend the original AP to Map/Reduce Affinity Propagation (MRAP), implemented in Hadoop, a distributed cloud environment. The MRAP architecture is divided into multiple mappers and one reducer in Hadoop. In the experiments, we compare the clustering results of the proposed MRAP with those of the K-means method. The experimental results indicate that the proposed MRAP method performs well in terms of accuracy and Davies–Bouldin index value. In addition, applying the proposed MRAP method can reduce the number of iterations before convergence for the K-means method, irrespective of the data dimensions.
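The multiple-mapper, single-reducer layout described in the abstract can be illustrated with a minimal in-memory sketch. This is not the authors' Hadoop implementation: the abstract does not say how the data are partitioned or how the reducer merges mapper output, so the round-robin split, the second AP pass inside the reducer, the nearest-exemplar labelling, and all function names (`mapper`, `reducer`, `mrap`) are assumptions made for illustration. The AP routine uses the standard responsibility and availability updates with a median preference and a fixed iteration count.

```python
from statistics import median


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def affinity_propagation(points, damping=0.5, iterations=200):
    """Return indices of exemplar points found by plain AP message passing."""
    n = len(points)
    if n == 1:
        return [0]
    # Similarity: negative squared distance; the diagonal holds the preference.
    s = [[-sq_dist(points[i], points[k]) for k in range(n)] for i in range(n)]
    pref = median(s[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        s[k][k] = pref
    r = [[0.0] * n for _ in range(n)]  # responsibilities
    a = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):  # fixed count, no convergence test
        for i in range(n):
            vals = [a[i][k] + s[i][k] for k in range(n)]
            best = max(range(n), key=vals.__getitem__)
            second = max(v for k, v in enumerate(vals) if k != best)
            for k in range(n):
                new = s[i][k] - (second if k == best else vals[best])
                r[i][k] = damping * r[i][k] + (1 - damping) * new
        for k in range(n):
            pos = [max(0.0, r[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # positive support from points other than k
            for i in range(n):
                new = total if i == k else min(0.0, r[k][k] + total - pos[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new
    score = [r[k][k] + a[k][k] for k in range(n)]
    exemplars = [k for k in range(n) if score[k] > 0]
    # Fallback so callers always receive at least one exemplar.
    return exemplars or [max(range(n), key=score.__getitem__)]


def mapper(partition):
    """Run AP on one data partition and emit its local exemplar points."""
    return [partition[i] for i in affinity_propagation(partition)]


def reducer(local_exemplar_lists):
    """ASSUMPTION: merge mapper output by running AP on the pooled exemplars."""
    pooled = [p for exemplars in local_exemplar_lists for p in exemplars]
    return [pooled[i] for i in affinity_propagation(pooled)]


def mrap(points, n_mappers=2):
    """Hypothetical driver: round-robin split, map, reduce, then label points."""
    partitions = [points[i::n_mappers] for i in range(n_mappers)]
    partitions = [p for p in partitions if p]  # drop empty splits
    centers = reducer([mapper(p) for p in partitions])
    labels = [min(range(len(centers)), key=lambda c: sq_dist(x, centers[c]))
              for x in points]
    return centers, labels


if __name__ == "__main__":
    data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
            (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
    print(mrap(data, n_mappers=2))
```

Because each mapper only sees its own partition, the pairwise similarity matrix it builds is much smaller than one built over the full data set. The exemplars returned by `mrap` could also serve as initial centers for K-means, which is one possible reading of the abstract's remark about reducing K-means iterations.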

Similar resources

A New Knowledge-Based System for Diagnosis of Breast Cancer by a combination of the Affinity Propagation and Firefly Algorithms

Breast cancer has become a widespread disease among young women around the world. Expert systems developed with data mining techniques are valuable tools in the diagnosis of breast cancer and can help physicians in the decision-making process. This paper presents a new hybrid data mining approach to classify two groups of breast cancer patients (malignant and benign). The proposed approach, AP-AMBFA, con...

A parallel attribute reduction algorithm based on Affinity Propagation clustering

As information technology develops rapidly, massive and high-dimensional data sets have appeared in abundance. Existing attribute reduction methods encounter bottlenecks in timeliness and spatiality. AP (Affinity Propagation) is an efficient and fast clustering algorithm for large data sets compared with existing clustering algorithms. This paper discusses attribute clus...

Partition Affinity Propagation for Clustering Large Scale of Data in Digital Library

Data clustering is very useful in helping users navigate large-scale data in a digital library. In this paper, we present an improved algorithm, based on Affinity Propagation, for clustering large-scale data sets with dense relationships. First, the input data are divided into several groups and Affinity Propagation is applied to each of them. Results from the first step are grouped together in...

A Graph Clustering Algorithm Providing Scalability

Based on current studies of the affinity propagation and normalized cut algorithms, a new scalable graph clustering method called APANC (Affinity Propagation And Normalized Cut) is proposed in this paper. During the APANC process, we first use “Affinity Propagation” (AP) to preliminarily group the original data in order to reduce the data scale, and then we further group the re...

A Survey On Seeds Affinity Propagation

Affinity propagation (AP) is a clustering method that can find data centers or clusters by sending messages between pairs of data points. Seed Affinity Propagation is a novel semi-supervised text clustering algorithm based on AP. The AP algorithm cannot cope directly with partially known data. Therefore, focusing on this issue, a semi-supervised scheme called incremental affinity propagation c...

Publication date: 2014